Abstract


Intermediate network elements, such as network address 
translators (NATs), firewalls, and transparent caches are 
now commonplace. The usual reaction in the network ar- 
chitecture community to these so-called middleboxes is 
a combination of scorn (because they violate important 
architectural principles) and dismay (because these vi- 
olations make the Internet less flexible). While we ac- 
knowledge these concerns, we also recognize that mid- 
dleboxes have become an Internet fact of life for impor- 
tant reasons. To retain their functions while eliminating 
their dangerous side-effects, we propose an extension to 
the Internet architecture, called the Delegation-Oriented 
Architecture (DOA), that not only allows, but also facili- 
tates, the deployment of middleboxes. DOA involves two 
relatively modest changes to the current architecture: (a) 
a set of references that are carried in packets and serve as 
persistent host identifiers and (b) a way to resolve these 
references to delegates chosen by the referenced host. 


1 Introduction 


The Internet’s architecture is defined not just by a set of 
protocol specifications but also by a collection of general 
design guidelines. Among the architecture’s original 
principles [12] are two tenets at the network layer (i.e., 
IP layer) that are still widely valued, but are nonetheless 
often disobeyed in the current Internet: 


#1: Every Internet entity has a unique network- 
layer identifier that allows others to reach it. During 
the Internet’s youth, every network entity had a globally 
unique, fixed IP address. However, the emergence 
of private networks and host mobility, among other 
things, ended the halcyon days of unique identity and 
transparent reachability. Now, many Internet hosts have 
no globally unique handle that serves to direct packets 
to them. 


#2: Network elements should not process pack- 
ets that are not addressed to them. We call this tenet 
“network-level layering”, and it implies that only a 
network element identified by an IP packet’s destination 
field should inspect the packet’s higher-layer fields. 
This decades-old guideline has become an empty
commandment, as firewalls, network address translators 
(NATs), transparent caches, and other widely deployed 
network elements use higher-layer fields to perform their 
functions. 


That these tenets are routinely violated is not merely 
an Internet legalism. The inability of hosts in private 
address realms to pass handles allowing other hosts 
to communicate with them has hindered or halted the 
spread of newer protocols, such as SIP [24] and various 
peer-to-peer systems [18]. Layer violations lead to rigid- 
ity in the network infrastructure, as the transgressing 
network elements may not accommodate new traffic 
classes. The hundreds of IETF proposals for working 
around problems introduced by NATs [54], firewalls, 
and other layer-violating boxes are compelling evidence 
that middleboxes (as such hosts are collectively known) 
and the Internet architecture are not in harmony [8]. 
Indeed, because middleboxes violate one or both tenets 
above, Internet architects have traditionally reacted to 
them with disdain and despair. 

We take a different view. Rather than seeing middle- 
boxes as a blight on the Internet architecture, we see the 
current Internet architecture as an impediment to middle- 
boxes. We believe such intermediaries, as we will call 
them, exist for important and permanent reasons, and we 
think the future will have more, not fewer, of them. 

The market will continue to demand intermediaries
for various reasons. NATs maintain and bridge between
different IP spaces.1 Firewalls and other boxes that
intercept unwanted packets will be increasingly needed
as attacks on end-hosts grow in rate and severity. Since
even sophisticated users have difficulty configuring PCs
to be impervious to attack, we believe that users would
want to outsource this protection to a professionally
managed host (one not physically interposed in front
of the user) that would vet incoming packets. Under
the current architecture, such outsourcing to “off-path”
hosts requires special-purpose machinery and extensive
manual configuration. Intermediaries can also increase
performance through, for example, caching and
load-balancing. Commercial service providers will continue
to take advantage of such performance-enhancing
intermediaries, disregarding architectural purity.


1Even if the move to IPv6 accelerates, some IPv4 networks will
remain. Moreover, private address realms give some protection against
certain types of network attacks. Hence, we do not think private IP
spaces are a temporary inconvenience that will soon end.

Thus, we have a fundamental conflict: although in- 
termediaries offer clear advantages, they run afoul of 
the two tenets above, which causes harm and makes de- 
ploying and using intermediaries unnecessarily hard. Our 
goal, therefore, is an architecture hospitable to intermedi- 
aries, specifically one that allows intermediaries to abide 
by the two tenets, to avoid other architectural infractions, 
and to retain the same functions as today. Such an archi- 
tecture would let a variety of middleboxes be deployed 
while also letting end-system protocols evolve indepen- 
dently and quickly. 

Our architecture—which we call the Delegation- 
Oriented Architecture (DOA)—is based on two main 
ideas. First, all entities have a globally unique identi- 
fier in a flat namespace, and packets carry these identi- 
fiers. Second, DOA allows senders and receivers to ex- 
press that one or more intermediaries should process 
packets en route to a destination. This delegation lets 
the resulting architecture embrace intermediaries as first- 
class citizens that are explicitly invoked and need not 
be physically interposed in front of the hosts they ser- 
vice. Globally unique identifiers and delegation have ex- 
isted in previous work describing different architectures 
(e.g., i3 [57]); see §9 for details. This paper’s contribution
is defining a relatively incremental extension to the
Internet architecture, DOA, that coherently accommo- 
dates network-level intermediaries like NATs and fire- 
walls. DOA requires no changes to IP or IP routers but 
does require changes to host and intermediary software. 
However, these changes are modular, and current appli- 
cations can be easily ported. 

We illustrate DOA with two examples: first, 
“network-extension boxes” which are analogous to to- 
day’s NATs in their establishment of private addressing 
realms but do not obscure hosts’ identities, and second, 
“network filtering boxes” which are analogous to today’s 
firewalls but do not violate network-level layering and 
need not be topologically in front of the hosts they ser- 
vice. Our goal is to show how our architecture easily ac- 
commodates these boxes. 


Scope and Limitations 


DOA is based on a subset of the architecture in a pre- 
vious paper [3]. That position paper touches on vari- 
ous issues—including mobility, multi-homing, network- 
level middleboxes, and application-level middleboxes— 
with scant attention to design details or implementations. 
In an attempt to bring some of those nebulous architectural
mutterings into focus, this paper concentrates exclusively
on network-level intermediaries and ignores their
application-level counterparts.2 This paper does not discuss
mobility and multi-homing scenarios either (though
DOA, because it separates location and identity, could— 
with modest extensions—handle those scenarios). Given 
our limited focus, DOA should be viewed not as a com- 
prehensive architecture but rather as an architectural 
component designed to address network-layer middle- 
boxes. Its design presumes IPv4 at the network layer but 
DOA is also compatible with, and useful for, IPv6. 

The final limitation we mention is that some peo- 
ple want to deploy tenet-violating middleboxes (e.g., a 
censorious government that silently filters packets en- 
tering and exiting national borders) and that DOA can 
neither prevent such architecturally suspect middleboxes 
nor mitigate their damage. 


2 Background 


This review of common network-layer middleboxes is 
limited to the two we build under DOA (NATs and
firewalls) and to a subset of their drawbacks; for a complete
review, see [8, 18, 23, 38, 55]. Although NAT and
firewalling are often combined in one box, these two 
functions are logically separate. 


2.1 NAT and NAPT 


Network Address Translation (NAT) and Network Ad- 
dress Port Translation (NAPT) [54] have several uses. 
For the purposes of this paper, we assume the follow- 
ing common scenario: a NAT or NAPT box bridges two 
address realms, at least one of which is private. Private 
addresses are unique within an address realm but am- 
biguous between address realms [46]; public addresses 
are globally unique and reachable from all Internet hosts. 
The hosts in the private realm are said to be behind the 
box. Packets destined for hosts behind the box are said to 
be inbound. The difference between NAT and NAPT is 
that NATs do not look at fields beyond the IP header. We 
adopt the convention of referring to both NAT and NAPT 
as “NAT”, though our description focuses on NAPT (the 
more common of the two today); for simplicity, we as- 
sume that NAPTs have only one external IP address. 
People deploy NATs for two reasons: 


* Convenience and Flexibility: Private addressing 
realms allow people to administer a set of hosts with- 
out having to obtain public IP addresses for each. 


* Security: Since hosts behind the NAT do not have
global identities, a host outside the private realm cannot
address the hosts in the private realm or express
that traffic should go to them, which protects them
from unwanted traffic. Also, by default (i.e., without
manual configuration), a NAT allows only inbound
traffic that is part of a connection initiated by a host
behind the NAT.


2The basic architectural ideas can be illustrated with network-level
intermediaries. At the application level, one must consider how applications
are structured and named, a topic outside this paper’s scope [3].


The main operations performed by a NAT are: (1) dy- 
namically allocating a source port at its public IP address 
when a host behind it initiates a TCP connection or sends 
a UDP packet; and (2) rewriting IP address and transport- 
layer port fields to demultiplex inbound packets to the 
hosts behind the NAT and to multiplex outbound pack- 
ets over the same source IP address. NATs violate both 
tenets in §1. First, a NATed host’s conception of its iden- 
tity (namely its IP address) is a private address and thus 
is not a handle that it can pass around to allow other net- 
work entities to reach it. Second, NATs’ modification of 
port fields violates tenet #2. 
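
The two NAT operations just listed can be sketched as a small translation table. This is a minimal illustration, not a description of any real NAT: the class name `Napt`, the starting port 40000, and the sequential allocation policy are all assumptions made for the example, and the single public IP address follows the simplification adopted above.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, transport-layer port)


class Napt:
    """Toy NAPT with one public IP address (hypothetical, for illustration)."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port          # assumed sequential allocation
        self.out: Dict[Endpoint, int] = {}   # private endpoint -> public port
        self.back: Dict[int, Endpoint] = {}  # public port -> private endpoint

    def outbound(self, src: Endpoint) -> Endpoint:
        """Operation (1): allocate a public source port on first use,
        then rewrite the private source to (public IP, allocated port)."""
        if src not in self.out:
            self.out[src] = self.next_port
            self.back[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.out[src])

    def inbound(self, dst_port: int) -> Optional[Endpoint]:
        """Operation (2): demultiplex an inbound packet by destination port.
        Returns None when no host behind the NAT initiated that mapping."""
        return self.back.get(dst_port)
```

Note that `inbound` can only answer for ports that an inside host has already caused to be allocated, which is why unsolicited inbound traffic requires the manual configuration discussed next.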

NATs cause the following additional problems: 


* In order for a server behind a NAT to receive unsolicited
inbound packets sent to a given destination
port, one must statically configure the NAT with instructions
about packets with that destination port.
This manual step is called hole punching and requires
administrative control over the NAT. The amount of
manual configuration increases when a series of NATs
separate a server from the public Internet, creating
a tree of private address spaces.3 In this case, one
must not only configure each of the NATs in the tree
but also coordinate among them; e.g., each globally
reachable Web server in a given tree of NATs must get
traffic on a different port on the outermost NAT’s public
IP address. (By outermost, we mean “connected to
the globally reachable Internet”.)


* Hosts behind the same NAT cannot simultaneously
receive traffic sent to the same TCP port number on 
the NAT’s public IP address. However, some applica- 
tions require traffic on a specific port; e.g., IPSEC ex- 
pects traffic on port 500 [44], so only one host within 
a tree of NATs can receive Virtual Private Network 
(VPN) [21] service. 


2.2 Firewalls 


A firewall blocks certain traffic classes on behalf of a host 
by examining IP-, transport-, and sometimes application- 
level fields and then applying a set of “firewall rules”. It 
must be on the topological path between the host and the 
host’s Internet provider, which we argue is unnecessarily 
restrictive. Today’s firewalls disobey tenet #2 because, 
by design, they must inspect many non-IP fields in each 
packet. Since firewalls by default distrust that which they 
do not recognize, they may block novel but benign traffic, 
even if the intended recipient wants to allow the traffic. 


3Such series of NATs are not artificial; see §5.4 and Figure 4. 


3 Architectural Overview of DOA 


This section gives an overview of DOA; we defer design 
details to §4. We first list desired architectural proper- 
ties that aid in gracefully accommodating intermediaries 
and then describe mechanisms to achieve those proper- 
ties. The remainder of the section discusses how DOA 
extends the Internet architecture. 


3.1 Desired Architectural Properties 


(1) Global identifiers in packets: Each packet should 
contain an identifier that unambiguously specifies 
the ultimate destination. The Internet architecture, as 
originally conceived, did provide global identifiers in 
packets, but IPv4 addresses no longer meet the “global 
identifier” requirement. (IPv6 addresses, because they
reflect network topology, are also unsuitable for us, as 
we elaborate below.) The purpose of a global identifier is 
to uniquely identify the packet’s ultimate destination to 
intermediaries in a way that is application-independent. 


(2) Delegation as a primitive: Hosts should have 
an application-independent way to express to others that, 
to reach the host, packets should be sent to an interme- 
diary or series of intermediaries. This primitive—called 
delegation—allows end-hosts or their administrators 
to explicitly invoke (and revoke) intermediaries. These 
intermediaries need not be “on the topological path”. 


3.2 Mechanisms


EIDs: To achieve property (1), each host has an unam- 
biguous endpoint identifier picked from a large names- 
pace. Our design imposes the following additional re- 
quirements: 


(a) The identifier is independent of network topology 
(ruling out IPv6 addresses and other identifiers with 
topology-dependent components, as in [42, 43]). 
With this requirement, hosts can change locations 
while keeping the same identifiers. 


(b) The identifier can carry cryptographic meaning (rul- 
ing out human-friendly DNS names). We explain 
the purpose of this requirement later in this section. 


To satisfy these requirements, we choose flat 160-bit 
endpoint identifiers (EIDs). A DOA header between 
the IP and TCP headers carries source and destination 
EIDs. Transport connections are bound to source and 
destination EIDs (instead of to source and destination 
IP addresses as in the status quo). DOA borrows the 
idea of topology-independent EIDs from previous work, 
including Nimrod [34], HIP [39], UIP [17]. 


EIDs are resolved . . .: DOA provides for delegation
as a primitive by resolving EIDs. We presume a
mapping service, accessible to Internet hosts, that maps
EIDs to some target specified by the EID owner. This
resolution has two flavors:


* ... to IP addresses: In order to communicate with
an end-host identified by an EID, a prospective peer 
uses the mapping service to resolve the EID to an IP 
address. This indirection creates a way for a host to 
specify that prospective peers should direct their pack- 
ets to a given delegate: the host has its EID resolve to 
the IP address of the delegate. 


* ... to other EIDs: More generally, an EID can resolve
to another EID, allowing an end-host to map its EID to
a delegate’s identity; if an end-host’s EID had to map
to the delegate’s IP address (or any other topology-dependent
identifier), the end-host would have to update
the mapping whenever the delegate’s location
changed. An EID can also resolve to a sequence of
EIDs, each of which identifies an intermediary specified
by the host. This sequence is carried in packets,
yielding a loose source route in identifier space.4 This
option is reminiscent of i3’s stacked identifiers.


Thus, our design requires an EID resolution infrastruc- 
ture. We wish the management of this infrastructure to 
be as automated as possible, which is why we had re- 
quirement (b), above: automated management is easier 
if the EIDs are vested with cryptographic meaning [36]. 
The resolution infrastructure must scalably support a 
put()/get() interface over a large, sparse, and flat names- 
pace. Distributed hash tables (DHTs) [2, 14, 49, 62] give 
exactly this capability, but any other technology that of- 
fers this capability would also suffice. DNS’s “resolve- 
your-own-namespace” economic model cannot be used 
here, but there are plausible scenarios for the economic 
viability of a DHT-based resolution infrastructure [61]. 

We have not yet mentioned sender-invoked interme- 
diaries. Under DOA, senders invoke intermediaries by 
putting into packets additional identifiers beyond the 
identifiers that resulted from resolving the receiver’s 
EID. Sender-invoked intermediaries receive little atten- 
tion in this paper but are part of DOA’s design. 


3.3 DOA and the Two Tenets


We elaborate on our earlier claim that DOA allows in- 
termediaries to abide by the two tenets in §1. Because 
they are location-independent and drawn from a massive 
namespace, EIDs can globally and unambiguously iden- 
tify hosts, satisfying tenet #1. As a result, a network el- 
ement can reply to the source of a packet by sending to 
the location given by the resolution of the source EID. 
To obey network-level layering (tenet #2), network
elements need only follow normal IP layering rules, as
follows. If an IP packet arrives at a network element
and the packet’s destination IP address is not the network
element’s, then the element may change nothing
in the packet besides per-hop fields. (However, elements
may drop packets based on information in the IP header,
which permits functions such as ingress and egress filtering.)
If, on the other hand, the packet’s destination
IP address matches the network element’s, there are
two cases: (1) The destination EID in the DOA header
matches the network element’s EID (i.e., the packet has
reached its destination); or (2) These EIDs do not match,
which means the element is a delegate. In the latter case,
network-level layering implies that the allowed packet
operations are up to the entities in the delegation
relationship.


4In this case, transport connections are bound to the ultimate endpoint,
which is identified by the last EID in the sequence.
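
The layering rules above amount to a three-way decision at each network element. The following sketch makes that decision explicit; the names `Action` and `classify` are hypothetical, and addresses and EIDs are represented as plain strings purely for illustration.

```python
from enum import Enum


class Action(Enum):
    FORWARD_UNCHANGED = "forward"   # not addressed to us at the IP layer
    DELIVER = "deliver"             # we are the packet's ultimate destination
    ACT_AS_DELEGATE = "delegate"    # addressed to us, but on behalf of another EID


def classify(my_ip: str, my_eid: str, dst_ip: str, dst_eid: str) -> Action:
    """Apply the network-level layering rules to one packet (sketch only)."""
    if dst_ip != my_ip:
        # Only per-hop fields may change. Dropping on IP-header information
        # (e.g., ingress or egress filtering) is still allowed, but is not
        # modeled here.
        return Action.FORWARD_UNCHANGED
    if dst_eid == my_eid:
        return Action.DELIVER
    # The IP address matches but the EID does not: this element was explicitly
    # invoked, so the permitted operations are set by the delegation relationship.
    return Action.ACT_AS_DELEGATE
```

An element that returns `ACT_AS_DELEGATE` may inspect or modify higher-layer fields without violating tenet #2, because the packet was addressed to it.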

Note that this last claim satisfies network-level lay- 
ering but allows violations of higher-level equivalents, 
e.g., an explicitly addressed firewall that looks at appli- 
cation payloads upholds the rules just given but flouts 
application-level layering. In general, this paper claims 
that DOA improves on the status quo by restoring 
network-level layering but does not insist that intermedi- 
aries adhere to higher-level layering. Why not? Higher- 
level layers define how to organize host software, and one 
can imagine splitting host software among boxes using 
exotic decompositions. Defining both higher-level layer- 
ing and an architecture that respects these higher layers 
is a problem that requires care and one we have left to fu- 
ture work. In the meantime, we believe that hosts invok- 
ing intermediaries should decide how best to split func- 
tions between them and their intermediaries. 

We now discuss how the IP layering rules given 
above apply to specific intermediaries. Under DOA, 
NATs, which exist to bridge address realms, need not 
obscure host identity: as we describe in more detail in 
§5, DOA-based NATs may rewrite IP fields but will nei- 
ther touch DOA fields that carry host identities nor over- 
load transport-layer fields. Also, firewalls could be ex- 
plicitly invoked, meaning that packets ending up at the 
firewalls would be addressed to the firewall. While these 
new firewalls (which we cover in §6) could certainly have 
outmoded policies, causing them to drop novel traffic 
classes just as today’s firewalls do, they are not violat- 
ing network-level layering because packets are addressed 
to them. One result of this explicit addressing is that the 
firewall’s invocation is under users’ (or their administra- 
tors’) control, so the user (or administrator) could decide 
to have packets destined for it sent to another firewall, 
one with better suited policies. 


3.4 DOA and Internet Evolvability 


Figure 1: High-level view of DOA design.


The preceding point is more general than firewalls and
is important for the Internet’s flexibility and evolvability.
Today, there is only one easy way to deploy a middlebox:
putting it “on the path”. Of course, under DOA, some
boxes would have to be on the topological path to enforce
physical security (e.g., for denial-of-service protection); 
§6.4 describes how DOA accommodates these on-path 
boxes. However, DOA—with its flexible and application- 
independent invocation primitive—also gives users or 
their administrators the option to outsource functional- 
ity. Thus, under DOA, fewer intermediaries would need 
to be physically interposed, and users, no longer limited 
to the capabilities of the boxes in front of them, could 
avail themselves of a menu of services. 

As a result, we believe that DOA could permit the rise 
of a competitive market in professionally managed inter- 
mediary services such as firewalls. Delegation and reso- 
lution are precisely what is necessary for such a market— 
the ability for users to select a provider and to switch 
providers. Because users would have a choice, they could 
seek the intermediary service that best suited their needs, 
and because these services would be professionally man- 
aged, they could keep up with the rapid pace of applica- 
tion innovation. Thus, we see DOA as contributing to the 
Internet’s ability to evolve. 

While we believe in its benefits, it is not clear that 
DOA is necessary for these new functions. In fact, we 
conjecture that even for those applications and interme- 
diaries that one can seemingly build only under DOA, 
someone with enough imagination and fortitude could 
achieve equivalent functionality under the status quo— 
but not without running afoul of a basic tenet of the In- 
ternet architecture. We do suspect that the mechanisms of 
DOA will help new Internet functionality to evolve, but 
ultimately we believe our contribution is not novel func- 
tion but rather novel architecture—making a class of net- 
work intermediary functions easier to build and reason 
about, and less likely to cause harm. 

Figure 2: Example DOA header with no stacked identifiers.


A natural question is how DOA relates to the canonical
end-to-end argument [51], which is often interpreted
as a warning against intermediaries. The central claim of
the end-to-end argument is that application intelligence
is best implemented on end-hosts and not “in the network”
because intelligence in the network leads to inflexibility
and because end-hosts know best what functions
they need. At a high level, DOA upholds this vision: the 
explicit invocation of intermediaries means that intelli- 
gence is not stuck in the network and that end-hosts can 
invoke the intermediaries that best serve them. 


4 Detailed DOA Design 


Given the preceding general description of DOA, we now 
present details of the design. Figure 1 shows the DOA
components and the interfaces between them. 


4.1 Header Format 


DOA packets are delivered over IP, with the IP protocol 
field set to a well-known value. An example DOA header, 
with no extensions, is shown in Figure 2; the header 
length is measured in four-byte words, the protocol field 
is the transport-level protocol (e.g., TCP, UDP) used by 
the packet, and the length field gives the DOA packet’s 
total length (including the DOA header but not IP header) 
in bytes. TCP and UDP pseudo-checksums include the 
EIDs where IP addresses are used today (since transport 
logically occurs between two entities, each identified by 
an EID). The DOA header is extensible (e.g., the re- 
mote packet filter presented in §6 extends the basic DOA 
header). 
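
To make the field descriptions concrete, the sketch below packs and unpacks a basic DOA header. The byte layout is an assumption made for this example (one byte of header length, one byte of protocol, two bytes of total length, then two 20-byte EIDs); the text fixes only the meaning of the fields and the 160-bit EID size, not their widths or order. The function names are likewise hypothetical.

```python
import struct

# Assumed layout, network byte order: header length in 4-byte words (1 byte),
# transport protocol (1 byte), total DOA packet length in bytes (2 bytes),
# source EID (20 bytes), destination EID (20 bytes).
DOA_HEADER = struct.Struct("!BBH20s20s")


def pack_header(protocol: int, payload_len: int,
                src_eid: bytes, dst_eid: bytes) -> bytes:
    """Build a DOA header with no extensions and no stacked identifiers."""
    if len(src_eid) != 20 or len(dst_eid) != 20:
        raise ValueError("EIDs are 160 bits (20 bytes)")
    header_words = DOA_HEADER.size // 4           # measured in four-byte words
    total_len = DOA_HEADER.size + payload_len     # DOA header included, IP header not
    return DOA_HEADER.pack(header_words, protocol, total_len, src_eid, dst_eid)


def unpack_header(data: bytes) -> dict:
    """Parse the fixed part of a DOA header from the start of `data`."""
    words, protocol, total_len, src, dst = DOA_HEADER.unpack_from(data)
    return {"header_words": words, "protocol": protocol,
            "total_length": total_len, "src_eid": src, "dst_eid": dst}
```

Under this assumed layout the basic header occupies 44 bytes, i.e., a header length of 11 words; an extended header would carry a larger value in the same field.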


4.2 Resolution and Invoking Intermediaries 


A DOA host wishing to send a packet to a recipient ob- 
tains the recipient’s EID e out-of-band (e.g., by resolving 
the recipient’s DNS name to e). The sender then uses 
the EID resolution infrastructure—which is discussed 
in §3.2 and which we base on DHTs—to retrieve an 
erecord, depicted in Figure 3. An erecord’s fields are 
as follows: the EID field is the EID being resolved; the 
Target field is described in the next paragraph; the Hint 
field is optional information whose use we illustrate in 
§5; and the TTL field, like DNS’s TTL, is a hint indicating
how long entities should cache the erecord. DOA
presumes that EID owners (or administrators acting on
their behalf) maintain and possibly periodically refresh5
the DHT’s copy of their erecord.


5Some DHTs, like OpenDHT [29], store only soft state, requiring 
EID owners to do refreshes. 


EID: 0x345ba4d...
Target: EID* or IP address
Hint: e.g., IP address
TTL: time-to-live, a caching hint


Figure 3: The erecord. 


Recall from §3.2 that EIDs can either resolve to IP ad- 
dresses (inducing what we call EID-to-IP mappings) or 
to one or more EIDs (inducing EID-to-EID* mappings). 
If the Target field of the erecord contains only an IP 
address i, then, as described in §3.2, the sender simply 
transmits a packet whose destination IP address is i and 
whose destination EID is e. In this case, the EID owner 
may or may not be directing potential senders to a del- 
egate, but the semantics are the same: the EID owner is 
saying “to reach me, send your packet there”. 

If, on the other hand, the Target field of the erecord 
contains one or more EIDs, then the recipient is express- 
ing its wish that the packet transit one or more interme- 
diaries before reaching the recipient. In this case, the se- 
mantics are “to reach me, send your packet to these in- 
termediaries in sequence”. The sender would resolve the 
first EID in the series to an IP address j (perhaps via in- 
termediate resolutions to other series of EIDs, each of 
which would be injected into the original series in the 
logical order) and send the packet to j. This stack of EIDs 
is carried in the DOA header; transport connections are 
bound to the last EID, which identifies the ultimate des- 
tination. (The design, but not our implementation, lets an 
EID resolve to multiple IP addresses; the multiplicity re- 
flects a multi-homed host or an anycast situation in which 
a set of hosts are equivalent for the erecord owner’s pur- 
poses. Similarly, each EID in the Target field could really 
be a set of EIDs, again representing equivalent hosts.) 
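
The resolution procedure just described can be sketched as a short loop. This is one possible reading, under stated assumptions: the DHT is modeled as an in-memory dictionary, an EID is a string, a Target that is a string is an IP address while a Target that is a list is a series of intermediary EIDs, and the sender builds its stack by placing that series ahead of the EID being resolved. The names `ERecord` and `resolve` and the `max_steps` guard are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class ERecord:
    eid: str
    target: Union[str, List[str]]  # IP address, or a series of EIDs (assumed encoding)
    ttl: int = 60


def resolve(eid: str, dht: Dict[str, ERecord],
            max_steps: int = 16) -> Tuple[str, List[str]]:
    """Return (next-hop IP address, EID stack to carry in the DOA header).

    The last element of the stack is always the ultimate destination, to which
    the transport connection is bound.
    """
    stack = [eid]
    for _ in range(max_steps):
        record = dht[stack[0]]
        if isinstance(record.target, str):
            # EID-to-IP mapping: send the packet to this address.
            return record.target, stack
        # EID-to-EID* mapping: inject the series ahead of the EID it came from.
        stack = list(record.target) + stack
    raise RuntimeError("resolution did not terminate")  # guards against cycles
```

For example, if a host's EID resolves to a firewall's EID, and the firewall's EID resolves to an IP address, the sender addresses the IP packet to the firewall and carries both EIDs, firewall first, in the DOA header.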

To send a packet back to the source, the receiver executes
the steps just described to resolve the sender’s EID,
f. The receiver cannot simply use the source IP address
in the original packet as the destination IP address in 
the reply packet because f may resolve to a different IP 
address (e.g., f’s host sends packets directly but wants 
packets to it sent through an intermediary). 

To spare the server the burden of a DHT lookup, the 
client can send its erecord as an optimization. (The 
client may have to send more than one erecord since 
the client’s EID may resolve to a chain of EIDs before 
being resolved to the IP address needed by the server.) 
Also, DOA hosts use the erecord’s TTL to maintain a 
TTL-based cache of EID-to-IP and EID-to-EID* values, 
thus avoiding a DHT lookup for every packet. 
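
A TTL-based cache of the kind mentioned above might look like the following. This is a minimal sketch: the class name `ERecordCache` is hypothetical, TTLs are assumed to be in seconds, and the clock is injectable only so that expiry can be exercised without waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ERecordCache:
    """Cache of EID-to-IP and EID-to-EID* values, expired by TTL (sketch)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # eid -> (expiry, target)

    def put(self, eid: str, target: Any, ttl: float) -> None:
        """Remember a resolved target until `ttl` seconds from now."""
        self._entries[eid] = (self._clock() + ttl, target)

    def get(self, eid: str) -> Optional[Any]:
        """Return the cached target, or None if absent or expired.
        A None result means the caller must do a DHT lookup."""
        entry = self._entries.get(eid)
        if entry is None:
            return None
        expires, target = entry
        if self._clock() >= expires:
            del self._entries[eid]
            return None
        return target
```

Because the TTL is only a hint, a cached value may be stale before it expires; §4.5 returns to this limitation.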

The erecord and accompanying machinery exist to 
support receiver-invoked intermediaries. Senders invoke 
additional intermediaries by pushing the EIDs of the in- 
termediaries onto an identifier stack in the DOA header. 


4.3 Security and Integrity 


Because identities (namely, EIDs) are separate from lo- 
cations (namely, IP addresses), the following require- 
ment arises under DOA: The mapping from a given EID 
to its target must be correct, i.e., either resolving an EID, 
or using an erecord directly sent by a host, must yield 
the IP address intended by the EID owner or by the EID 
owner’s delegates. Specifically, DOA must satisfy the 
following properties: 


1. Anyone fetching an erecord must be able to verify
that the EID owner created it.


2. Only the owner of an EID may update the corresponding
erecord in the DHT.


3. When a host sends its erecord to another host without
using the DHT, the sending host must not be able
to forge the erecord.


To uphold these properties, DOA uses self- 
certification [36]: EIDs must be the hash of a public key, 
and the erecord is signed with the corresponding pri- 
vate key. When a host either performs a get() operation 
on the DHT, resulting in an erecord, or else receives 
an erecord directly from a purported EID owner, the 
host must check that the erecord is signed with the 
private key whose corresponding public key was hashed 
to create the EID in question. DHT nodes also perform 
this check before accepting erecords. For more details, 
including how EID owners may update their public keys 
without changing their EIDs, see [61]; we adopt the 
mechanisms described there. 
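
The self-certification check can be sketched in two steps: confirm that the EID is the hash of the presented public key, then confirm the signature. Several details below are assumptions rather than part of the design as stated: SHA-1 is used only because it yields 160 bits (the text does not name the hash function), the public key is assumed to accompany the erecord, and signature verification is delegated to a caller-supplied function `verify_sig` because the signature scheme is not specified here.

```python
import hashlib
from typing import Callable


def eid_from_public_key(public_key: bytes) -> bytes:
    """Derive a 160-bit EID from a public key (hash choice is an assumption)."""
    return hashlib.sha1(public_key).digest()


def verify_erecord(eid: bytes, public_key: bytes, body: bytes, signature: bytes,
                   verify_sig: Callable[[bytes, bytes, bytes], bool]) -> bool:
    """Accept an erecord only if the key hashes to the EID and the signature holds.

    `verify_sig(public_key, body, signature)` is a hypothetical hook standing in
    for whatever public-key signature scheme a deployment would use.
    """
    if eid_from_public_key(public_key) != eid:
        return False  # the key does not belong to this EID
    return verify_sig(public_key, body, signature)
```

The same check would run in both places the text names: at a host that receives an erecord (from the DHT or directly from a peer) and at a DHT node before it accepts a put().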

With the above properties satisfied, erecords cannot 
be forged, but senders can still spoof source EIDs (i.e., 
put the wrong source EID field in the packet). This at- 
tack is like spoofing a source IP address today (except 
that ingress and egress filtering, which help guard against 
IP address spoofing, are not applicable to EIDs): success- 
ful attacks do the same damage, and both attacks are de- 
tectable under two-way communication. For example, if 
a TCP client tries to spoof a source EID to a TCP server, 
when the server looks up the source EID (or uses the 
signed erecord supplied by the client), the server gets 
the correct (not fake) IP address for that EID, so when 
the server replies to the IP address, the host at that ad- 
dress will not complete the 3-way handshake. 

Security of the DHT itself is a topic outside the scope 
of this paper. We briefly observe that DHT nodes cannot 
forge erecords but can return old versions of erecords. 
A way to guard against this attack by consulting multiple 
DHT nodes, instead of one, is mentioned in [14]. 

Also, we note that while IP source routing creates 
security problems, DOA’s loose source routes of EIDs 
do not inherit these difficulties. With IP source routing, 
receivers reverse the source route to reply to a sender, 
which allows an adversary to carry out a man-in-the-middle
attack by placing its IP address in a forged source
route. Under DOA, however, hosts do not reverse the 
loose source route to reply to a sender. 


4.4 Host Software 


We now describe the software interface that a production 
DOA deployment would expose. Our prototype imple- 
mentation differs from this description; see §7.1. 

DOA software would run in the kernel and be ex- 
posed to applications with the Berkeley sockets API [37], 
which can extend to EID-based identification. Applica- 
tions would open a new socket type, PF_DOA (in anal- 
ogy with PF_INET, used by today’s IPv4-based appli- 
cations), and pass to the API a new data structure, the 
sockaddr_eid, which holds an EID and port (just as 
the sockaddr_in—which today’s IP-based applications 
use—holds an IP address and port). Some of the socket 
calls, such as connect() and sendto(), might cause the 
DOA software, depending on the state of its caches, 
to issue one or more DHT lookups to resolve the EID 
into potentially intermediate EIDs and also an IP ad- 
dress. One could port today’s applications by substitut- 
ing sockaddr_eid for sockaddr_in in the code, though 
client applications would need additional logic to get a 
server’s EID, perhaps via a DNS lookup. 
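
As a rough illustration of the sockaddr_eid idea, the following Python sketch models the structure as an EID plus a port and packs it into a fixed-size byte string, in analogy with sockaddr_in. The 20-byte EID length, the field order, and the names SockaddrEid, pack, and unpack are assumptions made for illustration; PF_DOA is a proposed socket type, not one that existing sockets implementations provide.

```python
import struct
from dataclasses import dataclass

EID_LEN = 20  # assumed EID size in bytes; this section does not fix it


@dataclass(frozen=True)
class SockaddrEid:
    """Hypothetical analogue of sockaddr_in that names a peer by EID."""
    eid: bytes
    port: int

    def pack(self) -> bytes:
        # EID bytes followed by the port in network byte order.
        if len(self.eid) != EID_LEN:
            raise ValueError("EID must be exactly EID_LEN bytes")
        return struct.pack(f"!{EID_LEN}sH", self.eid, self.port)

    @classmethod
    def unpack(cls, raw: bytes) -> "SockaddrEid":
        eid, port = struct.unpack(f"!{EID_LEN}sH", raw)
        return cls(eid, port)


# An application would obtain the EID and port out of band (e.g., via DNS).
addr = SockaddrEid(eid=bytes(range(EID_LEN)), port=80)
wire = addr.pack()
```

A ported application would build such a value where it builds a sockaddr_in today and hand it to connect() or sendto().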

For example, client TCP applications would call 
connect(), supplying a sockaddr_eid that contained 
an EID and port, both of which the application had ob- 
tained out of band. Similarly, TCP server applications 
would call accept(), getting back the EID and port of 
the initiating client. To reply to the client, the server’s 
DOA software would resolve the client’s EID to an IP 
address i and address reply packets to i at the IP layer. 

For bootstrapping, DOA hosts would be configured 
with the EIDs and IP addresses of one or more of the 
DHT nodes, in analogy with how today’s hosts learn the 
IP address of a DNS resolver (via hardcoding or DHCP). 
On boot up, the DOA software would insert into the 
DHT the host’s erecord (which could contain an EID- 
to-EID* or EID-to-IP mapping, depending on the host’s 
configuration) and would refresh the mapping periodi- 
cally or in response to host configuration changes. 


4.5 Limitations 


DOA hosts cache erecords, so hosts may have stale in- 
formation about prospective peers. Also, two DOA peers 
in a TCP session resolve each other’s EIDs only once— 
at the start of the session—so hosts cannot change loca- 
tions without breaking existing connections. DOA could 
overcome this limitation if it were extended with a sig- 
naling mechanism, as in [39,53], that allows hosts to no- 
tify peers of IP address changes. Finally, an EID owner 
cannot change which intermediaries are invoked based 
on who is trying to communicate with it. 


5 Network Extension Boxes Under DOA 


This section and the next describe example intermedi- 
aries under DOA. In the next section (§6), our focus is 
on filtering packets and how to move this function “off- 
path”. In this section, we show how the DOA framework 
accommodates boxes that bridge between different IP ad- 
dress spaces and also simplifies the use of these boxes. 
Under the status quo, these boxes are known as NATs 
but would be reincarnated under DOA as tenet-upholding 
Network Extension Boxes (NEBs). 

We first consider three usage scenarios for NEBs
(§5.1), then give our general approach, including a short
discussion of architectural coherence (§5.2), and then
discuss the benefits of this approach (§5.3). One of the
benefits, automatically exposing hosts behind NEBs, is
particularly useful when NEBs are cascaded (§5.4). We
present several mechanisms for achieving automatic configuration
(§5.5) and require that they work when there
are multiple levels of NEB. We conclude the section with
a discussion (§5.6).


5.1 Scope 


The following NEB scenarios reflect reasons for deploy- 
ing NATs today (§2.1) and are ordered by the degree of 
access control: 


(a) A host behind the NEB is accessible on all ports. The 
NEB creates a separate addressing realm but does 
not control access. Under this scenario, which cor- 
responds to the “Convenience and Flexibility” reason 
for deploying a NAT today (§2.1), many hosts within 
an organization can be reachable as first-class mem- 
bers of the Internet, even if the organization has only 
one IP address. 


(b) A host behind the NEB is accessible on configured 
ports, and the NEB blocks unsolicited traffic to the 
host on the other ports. This scenario, which reflects 
both reasons for deploying a NAT (§2.1), is analo- 
gous to, e.g., today setting up a Web server behind 
a NAT and configuring the NAT to send all packets 
with destination port 80 to the Web server. 


(c) A host behind the NEB is accessible on no ports, i.e., 
the host can only receive packets associated with con- 
nections it has initiated. This scenario, which is prin- 
cipally driven by the “Security” reason for deploying 
a NAT (§2.1), is the default under NAT today.


We expect that under DOA, scenario (b)—a mix of ac- 
cess control and exposure—would be most common. 
However, for clarity, we focus on scenario (a) and return 
to scenarios (b) and (c) at the end of the section (§5.6). 


5.2 Approach 


NEBs preserve packets’ DOA headers and use the destination
EID field as a demultiplexing token. For example,
the NEB could maintain an EID-to-IP table, look up the
destination EIDs of incoming packets, and then use the
results of these lookups to rewrite the destination IP addresses.
There are other ways to demultiplex; we cover
them in §5.5.
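
The table-based demultiplexing just described might look roughly like the following Python sketch. The Packet fields, the Neb class, and the string EIDs and addresses are hypothetical stand-ins chosen for illustration, and dropping packets for unknown EIDs is an assumed policy.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class Packet:
    dst_ip: str
    dst_eid: str   # destination EID carried in the DOA header
    dst_port: int
    payload: bytes


class Neb:
    """Sketch of a NEB that demultiplexes on destination EID, not port."""

    def __init__(self) -> None:
        self.eid_to_ip: Dict[str, str] = {}

    def expose(self, eid: str, internal_ip: str) -> None:
        self.eid_to_ip[eid] = internal_ip

    def forward_inbound(self, pkt: Packet) -> Optional[Packet]:
        internal_ip = self.eid_to_ip.get(pkt.dst_eid)
        if internal_ip is None:
            return None  # unknown EID: drop (assumed policy)
        # Only the destination IP address changes; the DOA header and
        # the port are left untouched.
        return replace(pkt, dst_ip=internal_ip)


neb = Neb()
neb.expose("eid-web-a", "10.0.0.2")
neb.expose("eid-web-b", "10.0.0.3")
pkt = Packet(dst_ip="203.0.113.7", dst_eid="eid-web-b", dst_port=80, payload=b"GET /")
out = neb.forward_inbound(pkt)
```

Both internal hosts can receive traffic on port 80 here, because the port plays no part in the lookup.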

This approach upholds the two tenets stated earlier. 
Tenet #1 holds because an end-host behind a NEB can 
pass its EID to others, who can then use this handle to 
direct packets to the given host. As mentioned in §3.3, to 
obey network-level layering (tenet #2) NEBs may only 
rewrite fields in a packet if the packet is addressed to the 
NEB. Since NEBs, like today’s NATs, have to rewrite 
both the destination IP addresses of inbound packets (to 
demultiplex them) and the source IP addresses of out- 
bound packets (to make them appear as if they originated 
at the NEB), the discussion in §3.3 implies that both in- 
bound and outbound packets be addressed to the NEB at 
the IP layer. 

However, this approach, in pure form, makes the NEB 
resolve the destination EIDs of outbound packets. As a 
practical matter, sources of outbound packets could do 
the resolution and put the resulting IP address some- 
where in the packet, thereby sparing the NEB this reso- 
lution burden. The source could even put the resulting IP 
address in the destination IP address field; at the IP layer, 
then, outbound packets would look alike under NEB and 
NAT. This modified approach—which technically vio- 
lates the rules in §3.3 but is consistent with the spirit of 
the tenets because the violation is under the control of 
the end-host—is what we adopt. 


5.3 Benefits 


Upholding the two tenets results in the following benefits,
some of which solve the problems stated in §2.1.

End-to-end communication. Communication is log- 
ically between two EIDs. Thus, protocols can uniquely 
identify hosts. 

Ports are not overloaded. Not using the destination 
port as a demultiplexing token lets multiple hosts behind 
a NEB receive packets sent to the same destination port. 

VPNs. Getting VPNs to work through NATs is cumbersome
and complicated [44]. The difficulties under the
status quo result from NATs rewriting both ports and
IP addresses. Under DOA, NEBs do not rewrite ports,
and the state associated with encrypted tunnels could be
bound to EIDs, not IP addresses. (Much of the HIP work
[40] focuses on such binding of IPSEC state to
cryptographically imbued EIDs.)

Automatic configuration. Under DOA, the process
of exposing a host behind a NEB can be automated.
When NEBs are cascaded, a scenario covered in the next
section, this automation is particularly useful, just as
exposing hosts is particularly problematic under the
status quo (§2.1).

Figure 4: A tree of NATs.


5.4 Cascaded NEBs 


The scenario of multiple address realms between a given 
host and the rest of the Internet is becoming more fre- 
quent. Consider the following example, depicted in Fig- 
ure 4: an individual runs a virtual host (using, e.g., 
VMWare [60]) that runs behind a NAT on the physi- 
cal host (such NATing of virtual hosts is common). The 
physical host is in turn a member of a home network that 
is all behind a single NAT, which is connected to a broad- 
band provider. The link from the broadband provider, 
owing to the provider’s operations, is itself NATed, mak- 
ing, altogether, three levels of NAT between the virtual 
host and the global Internet. 

We now cover protocols for automatically configuring 
NEBs to expose servers; we require the protocols to work 
when servers are behind multiple levels of NEB. 


5.5 Secure Automatic Configuration 


A protocol for configuring NEBs to expose servers must 
satisfy three requirements. First, the protocol must tell 
the end-host what to put in its erecord since an end- 
host separated from the global Internet by levels of NEB 
has no a priori knowledge about the IP addresses of 
NEBs between that end-host and the Internet. Second, 
the protocol must establish state, either in NEBs or in the 
EID resolution infrastructure, that allows NEBs to use 
the destination EID field in packets as a demultiplexing 
token for rewriting the destination IP address field. 
Third, this state must correspond to the wishes of 
the actual EID owner, rather than of an impostor try- 
ing to divert the EID owner’s traffic. This focus on au- 
thenticity is warranted because passing unprotected pro- 
tocol messages through levels of NEB could be prob- 
lematic. For example, an upstream provider cannot trust 
NEBs administered by its customers, and end-users can- 
not trust each other’s NEBs to correctly propagate con- 
trol or data messages. Also, NEB networks, like today’s 
NATs, would often be constructed over wireless links, 
which are susceptible to eavesdropping and tampering. 
In what follows, we assume that a NEB trusts only the
NEB directly upstream of it (called its parent); that NEBs
and end-hosts know the EID of their parent; and that all
links in the NEB network are vulnerable to eavesdropping,
tampering, and arbitrary data injection.

We now give three mechanisms, each using a different 
kind of EID resolution, that meet the requirements above. 
We implemented the third one; see §7.2. 


5.5.1 EID maps to EID 


Each NEB and end-host creates a mapping in the global 
EID resolution infrastructure from its EID to its parent’s 
EID; in other words, NEBs and end-hosts use the dele- 
gation primitive to say, “to reach me, send your packet to 
my parent’s EID”. Also, each NEB holds a mapping from 
its children’s EIDs to its children’s internal IP addresses. 

Control plane. Assume an end-host with EID $e_0$ must
traverse NEBs with EIDs $e_1$ through $e_n$ before reaching
the Internet. The end-host inserts a mapping from its EID
($e_0$) to its parent’s EID ($e_1$) into the global EID resolution
service. The end-host also sends a message to $e_1$
informing it of a mapping between its EID ($e_0$) and its
IP address ($i_0$). All other internal NEBs in the chain ($e_1$
through $e_{n-1}$) use the same protocol. The outermost NEB
uses the global EID resolution infrastructure to map its
EID ($e_n$) to its IP address ($i_n$), which is globally reachable.
A NEB with EID $e_{j+1}$ should only accept an EID-to-IP
mapping of the form $(e_j, i_j)$ if the mapping is authentic,
i.e., if it is signed by the private key corresponding
to $e_j$; performing this check might require $e_j$ to send
$e_{j+1}$ its public key (which should hash to $e_j$).
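
The authenticity check in the control plane can be sketched as two steps: confirm that the supplied public key hashes to the claimed EID, then verify the signature over the mapping. In the Python sketch below, the use of SHA-1, the byte layout of the signed message, and the caller-supplied verify_signature callable are all assumptions for illustration; the text above only requires that the key hash to the EID and that the mapping be signed.

```python
import hashlib
from typing import Callable

# Hypothetical signature verifier: (public_key, message, signature) -> bool.
Verifier = Callable[[bytes, bytes, bytes], bool]


def eid_from_public_key(public_key: bytes) -> bytes:
    # Assumed derivation: the EID is a hash of the public key.
    return hashlib.sha1(public_key).digest()


def accept_mapping(
    eid: bytes,
    ip: str,
    signature: bytes,
    public_key: bytes,
    verify_signature: Verifier,
) -> bool:
    """Accept the (EID, IP) mapping only if the child really owns the EID."""
    if eid_from_public_key(public_key) != eid:
        return False  # the key does not hash to the claimed EID
    message = eid + ip.encode("ascii")  # assumed encoding of the mapping
    return verify_signature(public_key, message, signature)
```

Note that this check alone does not stop replays of an old, validly signed mapping.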

This approach, as just described, is vulnerable to replays
of $(e_j, i_j)$ mappings. Such replays would allow the
wrong end-host (one that is later assigned IP address
$i_j$) to redirect $e_j$’s traffic to itself. We show how one might
protect against these attacks in §5.5.3.

Data plane. Assuming the end-host and intermediate
NEBs all initialize successfully, a remote client can send
data packets to the end-host (with EID $e_0$) by using the
EID resolution infrastructure to map $e_0$ to $e_1$, $e_1$ to $e_2$,
and so on, up the NEB chain. The last EID lookup maps
$e_n$ to the IP address $i_n$. The client then stacks the identifiers
$e_0$ through $e_n$ in its packets and sends the packet
to IP address $i_n$. Once the packet reaches the outermost
NEB ($e_n$), the NEB pops the top EID off the stack to
find that $e_{n-1}$ is the packet’s next hop. The NEB then
consults its routing table to map EID $e_{n-1}$ to IP address
$i_{n-1}$, rewrites the packet’s destination IP address to $i_{n-1}$,
and forwards the packet. This process continues until the
packet reaches its eventual destination, $e_0$.
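
The resolution and forwarding steps of this data plane can be sketched as follows. The dictionary standing in for the EID resolution infrastructure, the record encoding, and the function names are hypothetical. The sketch also assumes that each NEB pops its own EID from the top of the stack and treats the new top as the next hop, which is one reading of the description above.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # ("eid", parent EID) or ("ip", IP address)


def resolve_chain(dht: Dict[str, Record], eid: str) -> Tuple[List[str], str]:
    """Follow EID-to-EID records upward; return (EID stack, outermost IP)."""
    stack = [eid]
    while True:
        kind, value = dht[stack[-1]]
        if kind == "ip":
            return stack, value
        if value in stack:
            raise ValueError("delegation loop")
        stack.append(value)


def neb_forward(
    own_eid: str, children: Dict[str, str], stack: List[str]
) -> Tuple[str, List[str]]:
    """Pop this NEB's EID and look up the next hop's internal IP address."""
    if not stack or stack[-1] != own_eid:
        raise ValueError("packet is not addressed to this NEB")
    remaining = stack[:-1]
    if not remaining:
        raise ValueError("no next hop")
    return children[remaining[-1]], remaining


# Two levels of NEB between end-host e0 and the Internet.
dht = {"e0": ("eid", "e1"), "e1": ("eid", "e2"), "e2": ("ip", "203.0.113.7")}
stack, first_hop_ip = resolve_chain(dht, "e0")
hop1 = neb_forward("e2", {"e1": "10.0.0.2"}, stack)
hop2 = neb_forward("e1", {"e0": "192.168.0.5"}, hop1[1])
```

The sender performs one lookup per level of NEB, which is the cost that the third mechanism below avoids.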


5.5.2 EID maps to EID and a Hint 


Another approach uses the erecord’s Hint field, men- 
tioned in §4.2, to relieve NEBs of state. 

Control plane. The end-host inserts into the EID resolution
infrastructure a mapping from its EID, $e_0$, to the
EID, $e_1$, of its parent NEB; the erecord holding this
mapping has in its Hint field the end-host’s internal IP
address, $i_0$. The NEB $e_1$ likewise creates a mapping in
the EID resolution infrastructure from its own EID to the
EID, $e_2$, of its parent and puts its “outer” IP address, $i_1$,
into the Hint field of the erecord. This process continues
until the outermost NEB inserts a mapping from its
EID, $e_n$, to its “outer” IP address, $i_n$.

Figure 5: NEB and DHT state after each DOA-RIP round.

Data plane. A remote host wishing to communicate
with $e_0$ resolves $e_0$ to $e_1$, $e_1$ to $e_2$, ..., $e_{n-1}$ to $e_n$, while
remembering the Hints $i_0, i_1, \ldots, i_n$. As with the previous
mechanism, the remote host stacks the identifiers $e_0$
through $e_n$ in its packets (and in this case also includes
in the DOA header the IP addresses $i_0$ through $i_n$), then
sends the packet to IP address $i_n$. Once the packet reaches
the outermost NEB ($e_n$), the NEB pops the top EID and
IP address off the stack to find that $e_{n-1}$ with IP address
$i_{n-1}$ is the packet’s next hop, and the process continues.
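
A minimal sketch of the Hint-based variant follows, in which the sender collects (EID, IP) pairs during resolution and NEBs forward without consulting any table of their own. The record layout (parent EID or None, plus one IP address that serves either as the Hint or as the final resolved address) and the function names are simplifying assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical erecord: (parent EID or None for the outermost NEB, IP address).
# For EID-to-EID records the address plays the role of the Hint field.
ERecord = Tuple[Optional[str], str]
Hop = Tuple[str, str]  # (EID, IP address)


def resolve_with_hints(dht: Dict[str, ERecord], eid: str) -> List[Hop]:
    """Walk up the delegation chain, remembering each EID's address."""
    hops: List[Hop] = []
    current: Optional[str] = eid
    while current is not None:
        parent, ip = dht[current]
        hops.append((current, ip))
        current = parent
    return hops


def stateless_forward(hops: List[Hop]) -> Tuple[str, List[Hop]]:
    """Pop the top (EID, IP) pair; the new top names the next hop."""
    remaining = hops[:-1]
    if not remaining:
        raise ValueError("packet has reached its destination")
    return remaining[-1][1], remaining


dht = {
    "e0": ("e1", "192.168.0.5"),  # Hint: end-host's internal address
    "e1": ("e2", "10.0.0.2"),     # Hint: inner NEB's outer address
    "e2": (None, "203.0.113.7"),  # outermost NEB resolves to its IP address
}
hops = resolve_with_hints(dht, "e0")
next_ip, rest = stateless_forward(hops)
```

Because every address travels in the packet, the NEB keeps no per-host state, at the price of depending on the resolution infrastructure for correct routing.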


5.5.3 EID maps to IP address 


The previous two mechanisms require a prospective
sender to do as many EID resolution infrastructure
lookups as there are levels of NEB. An alternative, which
we call DOA-RIP, allows senders to do a single resolution:
from the EID, $e_0$, of the end-host to the IP address,
$i_n$, of the outermost NEB.

Control plane. End-hosts and NEBs follow a two-round
protocol, depicted in Figure 5. In the first round,
the end-host (with EID $e_0$) sends an initialization message
to its parent in the NEB tree; intermediate NEBs
forward the message until it reaches the outermost NEB
(with EID $e_n$). The outermost NEB creates a message
$x_n = (e_n, i_n, r_n)$ ($r_n$ is a random nonce to prevent replay
attacks), signs $x_n$, and sends it to the NEB with EID
$e_{n-1}$. Each NEB $e_k$ ($k < n$) follows suit, appending the
message $x_k = (e_k, i_k, r_k)$ to $x_{k+1}$. When the end-host receives
$x_1$, it verifies the message using $e_1$’s public key.
This message is a route to the global Internet.

In the second round, the end-host creates a series of
requests $y_k = (e_0, i_{k-1}, r_k)$ for $1 \le k \le n$; signs each
$y_k$ individually; concatenates all the $y_k$’s and appends its
public key; and sends this message up the NEB chain.
Each NEB $e_k$ verifies $y_k$ using $e_0$’s public key and signature.
Each NEB further checks that $r_k$ is in its cache
and that $r_k$ is the nonce it issued in the first round for
EID $e_0$ (the NEB flushes $r_k$ from its cache within a fixed
number of seconds, 10 in our implementation, of issuing
$r_k$). If these checks succeed, the NEB flushes $r_k$,
establishes a mapping $(e_0, i_{k-1})$, and propagates the request
up the NEB tree. If all NEBs successfully establish
the mapping, the end-host inserts into the EID resolution
infrastructure a map from $e_0$ to $i_n$.
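
The nonce handling in the second round can be sketched as a small cache with a fixed lifetime. The class name, the 16-byte nonce length, and the injectable clock are assumptions for illustration; signature verification is omitted, so this covers only the replay check that a NEB performs after it has verified the request.

```python
import hmac
import os
import time
from typing import Callable, Dict, Tuple


class NonceCache:
    """Per-NEB cache of nonces issued in round one, keyed by requesting EID."""

    def __init__(self, lifetime: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lifetime = lifetime  # seconds a nonce stays valid
        self._clock = clock
        self._issued: Dict[bytes, Tuple[bytes, float]] = {}

    def issue(self, eid: bytes) -> bytes:
        nonce = os.urandom(16)  # assumed nonce length
        self._issued[eid] = (nonce, self._clock() + self._lifetime)
        return nonce

    def redeem(self, eid: bytes, nonce: bytes) -> bool:
        """Return True once for a fresh, matching nonce, then flush it."""
        entry = self._issued.get(eid)
        if entry is None:
            return False
        stored, expiry = entry
        if self._clock() > expiry:
            del self._issued[eid]  # expired: flush
            return False
        if not hmac.compare_digest(stored, nonce):
            return False
        del self._issued[eid]  # used: flush so a replay fails
        return True
```

A replayed request fails because its nonce was flushed on first use, and a late request fails because the nonce expired.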

Data plane. To communicate with the end-host, remote
clients first resolve $e_0$ to $i_n$ and then send packets
with destination IP address $i_n$ and destination EID $e_0$,
at which point the outermost NEB, and all succeeding
NEBs in the chain, use their internal state to forward the
packet to the end-host.


5.6 Discussion 


Other scenarios. Though we focused on scenario (a) 
(from §5.1), the benefits noted above (in §5.3) apply
equally to scenario (b). Two of the three mechanisms 
for automatic configuration also apply (the stateless NEB 
from §5.5.2 does not) with the one change that end- 
hosts—when making signed requests of parent or ances- 
tor NEBs to add EID-to-IP mappings—need to add re- 
quests to open (or block) specific ports. This type of au- 
tomatic hole punching works under DOA, in contrast to 
the status quo, for three reasons: (1) DOA has a persistent 
notion of host identity, which allows NEBs to associate 
policies with hosts and remote network entities to iden- 
tify hosts behind the NEB; (2) port fields are not over- 
loaded under DOA, so internal nodes in the NEB tree do 
not have to coordinate among themselves, in contrast to 
the status quo wherein only one server in a tree of NATs 
can receive, e.g., traffic destined to port 80 on the outer- 
most NAT’s public IP address; and (3) hosts can leverage 
the cryptographic properties of their identities to create 
signed messages saying “handle my packets like this”. 
The benefits above, except automatic configuration, 
also apply to scenario (c). Although this scenario is the 
strictest access control NEBs offer, network administra- 
tors may still prefer NATs, since NATs, unlike NEBs, 
obscure the identities of the organizations’ end-hosts. 
Our response is that organizations today use NATs in
part because they hide internal network topology. Since
EIDs are independent of network internals, organizations
might be looser about exposing EIDs than IP addresses.

Comparison of the mechanisms. Observe that the 
three mechanisms above are different ways to perform 
routing that offer different trade-offs between state held 
in the NEB and the degree of fate-sharing. With one 
of the mechanisms (§5.5.2), all information about EID- 
to-IP mappings is in the EID resolution infrastructure, 
which simultaneously frees the NEB of state but makes 
correct routing depend on the availability of the resolu- 
tion infrastructure. In contrast, DOA-RIP pushes nearly 
all state into the NEBs along the path between two com- 
municating entities. 


6 Network Filtering Boxes Under DOA 


In this section, we demonstrate DOA’s delegation prim- 
itive with a simple remote packet filter (RPF) box that 
yields functionality similar to today’s firewalls but need 
not be interposed between a host receiving firewall ser- 
vice and that host’s link to the Internet. One can certainly 
get similar functionality today with special-purpose ma- 
chinery (e.g., VPN software, though their interfaces dif- 
fer across solution providers). However, we believe that 
decoupling services from topology is best done with ar- 
chitectural, rather than application, support because: (1) 
users should be able to compose intermediaries and (2) 
users should be able to change their delegates easily (see 
§3.4), both of which imply that the architecture support 
a standard, application-independent invocation method. 


6.1 Approach and Design 


The RPF is a basic application of DOA’s mechanisms; 
it is depicted in Figure 6. A user (or representative of 
the user, e.g., corporate IT staff) wanting remote firewall 
service creates a mapping in the EID resolution infras- 
tructure from the end-host’s EID, e, to the RPF’s EID, 
f (or to the RPF’s IP address, but then if that IP ad- 
dress changes, the resolution of e will be incorrect). This 
end-host expresses its actual network location either by 
putting its IP address, i, in the Hint field of the erecord
to which e resolves, or by communicating directly with 
the RPF and telling it about the association between e 
and i. (Our implementation, described in §7.3, uses the 
second option.) 

When a sender attempts to contact e, it first looks up 
e in the EID resolution infrastructure, sees that e maps 
to f, and then further resolves f to an IP address (which 
might involve intermediate resolution steps, depending 
on whether the RPF itself has delegates). In the simple 
case in which f resolves directly to an IP address j, the 
sender forms IP packets with destination address j and 
destination EID e. Note that f must be in the stack of
identifiers since the host given by j may actually be the
RPF’s delegate rather than the RPF itself (e.g., if the RPF
were behind a NEB, the NEB would need the EID f to
make a decision about the packet).

Figure 6: Packet filtering under DOA using delegation. End-
hosts apply a simple verification rule, not a collection of them.

When receiving IP packets, the RPF extracts the des- 
tination EID e from the DOA header, looks up the set 
of rules associated with e, and finally applies these rules; 
examples of such rules are filters to block or accept traffic 
based on IP- or transport-layer fields. The result is “pass- 
ing” or “failing” a packet. When packets “fail”, the RPF 
drops the packet. 
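
The per-EID rule lookup might be sketched as follows. Combining rules with AND or OR follows the prototype description in §7.3; the Policy and Rpf classes, the dictionary of packet fields, and the choice to fail packets for EIDs with no policy are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Mapping

Fields = Mapping[str, object]      # IP- and transport-layer fields of a packet
Rule = Callable[[Fields], bool]    # True means the rule accepts the packet


@dataclass
class Policy:
    rules: List[Rule]
    combine: str = "OR"  # "OR" or "AND"

    def passes(self, fields: Fields) -> bool:
        results = [rule(fields) for rule in self.rules]
        return all(results) if self.combine == "AND" else any(results)


class Rpf:
    """Sketch of the pass/fail decision keyed on destination EID."""

    def __init__(self) -> None:
        self.policies: Dict[str, Policy] = {}

    def vet(self, dst_eid: str, fields: Fields) -> bool:
        policy = self.policies.get(dst_eid)
        # Assumed default: fail packets for EIDs with no registered policy.
        return policy is not None and policy.passes(fields)


rpf = Rpf()
rpf.policies["eid-e"] = Policy(
    rules=[lambda f: f["proto"] == "tcp", lambda f: f["dst_port"] in (80, 443)],
    combine="AND",
)
```

Because the destination EID selects the policy, one RPF can serve many hosts with different rules.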

The RPF attests that packets “passed” by inserting 
into the packet a MAC (Message Authentication Code) 
taken over the packet; the MAC is keyed with a se- 
cret shared between RPF and end-host. The RPF then 
rewrites the packet’s destination IP address and sends 
the packet to the end-host, which applies a single rule: 
redoing the MAC computation and testing whether the 
result matches the MAC in the packet. The end-host ig- 
nores packets that fail this test; thus, only packets that 
have been vetted by the RPF are processed by the end- 
host’s networking and application software. The MAC 
is carried in a DOA security header, which extends the 
standard DOA header described in §4.1. 
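
The attestation and the end-host's single verification rule can be sketched with Python's standard hmac module. The use of SHA-256 as the underlying hash and the function names are assumptions; exactly which packet bytes the MAC covers is left abstract here.

```python
import hashlib
import hmac


def attest(shared_key: bytes, covered_bytes: bytes) -> bytes:
    """RPF side: compute the MAC placed in the DOA security header."""
    return hmac.new(shared_key, covered_bytes, hashlib.sha256).digest()


def end_host_accepts(shared_key: bytes, covered_bytes: bytes, mac: bytes) -> bool:
    """End-host side: redo the MAC computation and compare in constant time."""
    return hmac.compare_digest(attest(shared_key, covered_bytes), mac)


key = b"secret shared by RPF and end-host"
headers = b"example header bytes"
tag = attest(key, headers)
```

A packet that bypassed the RPF, or that was altered after vetting, carries no valid MAC and is ignored by the end-host.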

The RPF depends on both of DOA’s core mecha- 
nisms: first, because of unique host identifiers, the RPF 
has a way (namely the destination EID field) to distin- 
guish among hosts, allowing it to apply host-specific 
rules and then send the processed packet to the correct 
destination. Second, the delegation primitive is what al- 
lows the RPF to be invoked in the first place. See §7.3 
and §8.3 for implementation and evaluation details. 


6.2 Benefits 


We first claim two architectural benefits, as discussed in 
§3.3 and §3.4: the RPF described here does not violate 
network-level layering, and also, a market for such ser- 
vices could arise. 

These architectural benefits lead to simplification for
users. Getting firewall rules right is hard, far beyond ordinary
users’ ability, and commercial products (e.g., Norton
[59]) require users to keep their software current.
Outsourcing per-packet rules to a central provider solves
those problems. Of course, end-hosts still have to check
packets, but the check (“was this packet vetted by my
RPF provider?”) is considerably simpler than the usual
complement of firewall rules.


6.3 Limitations 


The box just described is primitive. For it to provide the 
same functions as today’s firewalls—such as using exist- 
ing TCP connections, and not just stateless filtering rules, 
to make decisions—protected end-hosts would have to 
direct their outbound traffic through the RPF. These end- 
hosts would use the mechanism of sender-invoked inter- 
mediation (§4.2). 


6.4 Physical Security 


Some organizations require that every inbound and out- 
bound packet be vetted by a box that is physically inter- 
posed between the organization and its link to the Inter- 
net. We briefly describe two scenarios for such on-path 
boxes under DOA. 

We start with an on-path vetter that works with an off- 
path RPF. As above, an end-host within the organization, 
h, creates a mapping in the EID resolution infrastructure 
from its EID, e, to the RPF’s EID. In this case, however, 
h tells the RPF that after the RPF processes packets des- 
tined to e, it should send them to the vetter’s IP address 
(instead of to h’s IP address, as above). The vetter allows 
packets into the organization only if they are addressed to 
it at the IP layer and if the MAC check succeeds, thereby 
ensuring that the RPF has checked every packet enter- 
ing the organization. The vetter uses the destination EID 
field to forward vetted packets to the correct host. 

Some organizations will of course not want an RPF,
preferring to deploy an on-path firewall and manage the
rules themselves. DOA supports an on-path firewall just as it
does an off-path firewall: the organization’s hosts map 
their EIDs to the EID or IP address of the on-path fire- 
wall. Since this setup is functionally the same as today’s 
on-path firewalls that are not explicitly invoked at the IP 
layer, one might wonder what DOA accomplishes here. 
The answer is uniformity: in this setup, the configuration 
of end-hosts is independent of the firewall’s placement. 
Thus, administrators can later move the firewall off-path 
without reconfiguring every host in the organization. 


7 Implementation 

We describe our implementation of end-host DOA soft- 
ware, a NEB prototype, and an RPF prototype. 

7.1 End-Host DOA Software 


In a production deployment, DOA software would be
part of kernel protocol stacks, as in §4.4. However, we
wanted to understand DOA’s properties before committing
to full kernel implementations, and so we prototyped
using a combination of user-level software and Click [31]
modules inside the Linux 2.4.20 kernel. Beyond a patch
required by Click, we did not modify the kernel. Figure 7
depicts our implementation of end-host DOA software.

Figure 7: The control and data paths in our prototype implementation of DOA. This figure depicts sending a packet to a given
EID obtained out-of-band. Events are numbered in chronological order.

Applications get EIDs via user input or DNS and re- 
solve them by invoking the GetHostByEID RPC, ex- 
posed by doad, which is a user-level daemon written in 
C++ using the SFS libasync library [35]. doad implements
the RPC by first querying OpenDHT [29] (the key
is the EID, e; the returned value is the IP address, i, that e
resolves to) and then, via Click’s /proc file system interface,
telling the Click “rewriter” module about the mapping
(e, i). doad returns an opaque handle in the 1.0.0.0/8
subnet that the application uses when the sockets API ex- 
pects an IP address; the opaque handles allow us to reuse 
much of the kernel’s IPv4, TCP, and UDP software. (Ap- 
plications could use EIDs instead of the opaque handles 
if the kernel’s networking software were extended to use 
the sockaddr_eid structure, as described in §4.4.) The 
rewriter module receives IP packets with addresses in the 
1.0.0.0/8 subnet, maps these opaque handles to EIDs and 
real IP addresses and then transmits bona fide DOA traf- 
fic. We did not implement EID-to-EID* mappings. 
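
The opaque-handle bookkeeping can be sketched as a table that hands out addresses from 1.0.0.0/8 and maps each back to an (EID, real IP) pair. The HandleTable class, its method names, and the sequential allocation order are assumptions for illustration and are not taken from doad or the Click rewriter.

```python
import ipaddress
from typing import Dict, Tuple


class HandleTable:
    """Sketch of the handle-to-(EID, real IP) state kept by the rewriter."""

    def __init__(self, subnet: str = "1.0.0.0/8") -> None:
        self._free = iter(ipaddress.ip_network(subnet).hosts())
        self._by_eid: Dict[str, str] = {}
        self._by_handle: Dict[str, Tuple[str, str]] = {}

    def put(self, eid: str, real_ip: str) -> str:
        """Record a resolved mapping and return the opaque handle for it."""
        handle = self._by_eid.get(eid)
        if handle is None:
            handle = str(next(self._free))  # allocate the next unused address
            self._by_eid[eid] = handle
        self._by_handle[handle] = (eid, real_ip)
        return handle

    def lookup(self, handle: str) -> Tuple[str, str]:
        """Used on the data path to recover the EID and real destination."""
        return self._by_handle[handle]


table = HandleTable()
h1 = table.put("eid-a", "65.43.2.1")
h2 = table.put("eid-b", "98.76.5.4")
```

The application passes the returned handle to the unmodified sockets API, and the rewriter substitutes the real address and EID on the way out.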


7.2 NEB Prototype 


We implemented (1) a NEB prototype in a Click module 
and (2) DOA-RIP (§5.5.3) in user space. The NEB has 
an EID-to-IP table which it uses to rewrite destination IP 
addresses and which gets an entry when a host behind 
the NEB, possibly separated by several other NEBs, runs 
DOA-RIP. After DOA-RIP completes and the NEBs, 
which also run DOA-RIP, have correct state, the host uses 
doad’s interface to OpenDHT to insert a mapping from 
the host’s EID to the outermost NEB’s external IP ad- 
dress, thereby making the host globally reachable. 


7.3 RPF Prototype 


The RPF is (1) a Click module that associates EIDs to a 
set of simple rules that are together applied (with OR or 
AND) to IP-, DOA-, and transport-layer fields to make 
a “pass” or “fail” decision for each packet and (2) user- 
level software that communicates with end-hosts, first, 
to establish a secret key for each EID (using an en- 
crypted, MACed control channel given by the SFS [36] 
toolkit) and, second, to process requests to add, change, 
or remove rules. End-host RPF users run (1) a MAC- 
checking Click module that injects into the kernel’s net- 
working stack only those packets that have been correctly 
MACed by the RPF and (2) user-level software to communicate
with the RPF, as just described. The RPF uses
HMAC [32]; in our prototype, the MAC is computed over
packet headers only.


8 Evaluation 


The architectural coherence afforded by DOA comes 
at a performance cost. This section characterizes that 
cost with microbenchmarks that measure the latency, 
throughput, and processing time overhead of DOA- 
enabled data transfers. 


8.1 Round-trip Times and Hops 

Compared to the status quo, DOA adds network round- 
trip times. DOA requires an extra resolution—of the 
EID—when a host first sends a packet to another host. 
For applications whose end-to-end latency is dominated 
by DNS lookups (such as Web browsing), the effect of 
these resolutions might be particularly pronounced. In 
the most basic DOA configuration—two end-hosts com- 
municating with no off-path intermediaries—the con- 
necting client makes two synchronous network calls: (1) 
a DNS request to map a human-readable hostname to an 
EID and (2) a DHT lookup to map a server’s EID to its 
IP address. A third synchronous DHT lookup is required 
for the server to resolve the client’s EID to an IP address. 


A recent study [26] indicates that median DNS lookups 
from a network at MIT can vary from about 70 ms in the 
case of NS server cache hits, to about 190 ms in the case 
of cache misses. By contrast, we measured median DHT 
lookups of random EIDs stored in OpenDHT at 138 ms. 
Thus, DOA can add noticeable delays to small data trans- 
fers, sometimes tripling their end-to-end latencies. 

However, for latency-sensitive hosts and applications, 
the following optimizations are possible: 


• The DHT could use Beehive’s [45] proactive, model-driven
  caching strategy to reduce the number of network
  round-trips required by lookups to an average of
  one or less than one (assuming the request pattern for
  EIDs is heavy-tailed).

• DNS names of hosts could resolve to the EID and the
  erecord (or to the chain of erecords that together
  indicate how to reach the host), thereby requiring one
  DNS lookup, as under the status quo, to send a packet
  to a host. In this case, DNS itself would be caching
  erecords.

• To save a remote host the burden of an EID resolution
  when responding to an initiating host, the initiating
  host could send its erecord (as noted in §4.2).


In addition, DOA adds network hops: when a packet 
travels from a source to an off-path middlebox en route to 
a destination, the packet (in most cases) takes more net- 
work hops than if it had traveled from source to destina- 
tion directly. This extra latency is inevitable if one wants 
to invoke an off-path intermediary. There is no “correct” 
trade-off between latency and the flexibility of off-path 
functions; different users have different preferences. 


8.2 Packet Size Overhead 


DOA packets in our implementation have a 68-byte DOA 
header (the 44 bytes shown in Figure 2 plus 24 for the 
DOA security header mentioned in §6.1). This over- 
head affects the maximum number of packets per sec- 
ond sustainable by DOA senders and receivers and is 
more costly for smaller packet payloads. For example, 
adding a DOA header to a 1466-byte UDP-over-IP-over- 
Ethernet packet (the UDP payload here is 1400 bytes) 
increases the packet size by 4.6%. For 130-byte packets 
(with UDP payload of 64 bytes), the 68 bytes of DOA 
header add overhead of more than 50%. 
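
The two percentages above follow directly from dividing the header size by the original packet size. The short Python check below reproduces them; the constant and function names are chosen here for illustration.

```python
DOA_HEADER_BYTES = 68  # 44-byte DOA header plus 24-byte security header


def header_overhead(packet_bytes: int) -> float:
    """Fractional size increase from adding the DOA header to a packet."""
    return DOA_HEADER_BYTES / packet_bytes


large = header_overhead(1466)  # 1400-byte UDP payload: about 4.6%
small = header_overhead(130)   # 64-byte UDP payload: about 52.3%
```

The same fixed 68 bytes cost roughly eleven times more, in relative terms, on the small packet than on the large one.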

For large packets, this overhead is likely to bottleneck 
DOA’s sustainable throughput. To verify this claim, we 
now characterize the throughput our DOA implementa- 
tion can sustain when sending and receiving large pack- 
ets. We measure both DOA and non-DOA traffic and find 
that the packet header overhead introduced by DOA ex- 
plains the throughput difference between the two cases. 
Each measurement below is the average of five trials
involving 1 GB of data, and the average packet drop rate
was less than 1%. Our experiments do not involve the
DHT; prior to the experiments, we resolved the
destination EIDs to IP addresses.


Component    Cycles    Time (μs)
DOA → IP     1894.2    1.11
Filter       9410.3    5.54
Verify       8773.8    5.16


Table 1: Processing time per packet for DOA components. The
first column contains the number of cycles, while the second
column contains the calculated time (in μs) needed to perform
that number of cycles on an Intel Celeron 1.7 GHz processor.

To get a baseline, we measured the number of UDP 
packets per second that one of our test hosts can send an- 
other. We tested large packets (1400-byte payloads) over 
a Gigabit Ethernet network, tuned the sending rate to 
achieve maximum throughput, and measured the number 
of packets that exited the receiver’s device driver queue. 
On average, the receiver processed 72900 packets per 
second, or 778.5 Mbit/s. The bottleneck here appears to 
be our hosts’ PCI buses. 

Next, we ran the same test with DOA packets. The 
sender uses our end-host DOA software (§7.1), which in- 
serts a DOA header into IP packets and rewrites IP head- 
ers. The receiver performs a similar process to translate 
DOA packets to IP packets. In this case, the receiver pro- 
cessed 69600 packets per second, or 743.0 Mbit/s, which 
is 4.6% slower than the baseline. We conclude that the 
slowdown here is due entirely to packet size overhead. 
These tests were for large packets; as noted above, small 
packets will be much more penalized by this overhead. 
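
As a rough consistency check on this conclusion, one can predict the DOA packet rate from the baseline under the assumption that the bottleneck sustains a fixed byte rate, so that larger packets simply mean proportionally fewer packets per second. The Python sketch below does this; the function name is ours, and the model ignores any framing overhead beyond the 1466-byte packet size quoted in §8.2.

```python
def predicted_rate(baseline_pps: float,
                   packet_bytes: int,
                   extra_bytes: int) -> float:
    """Packet rate if the byte rate stays fixed (an assumption) and
    every packet grows by extra_bytes."""
    return baseline_pps * packet_bytes / (packet_bytes + extra_bytes)


baseline_pps = 72900      # measured without DOA
measured_doa_pps = 69600  # measured with DOA
predicted = predicted_rate(baseline_pps, packet_bytes=1466, extra_bytes=68)

relative_gap = abs(predicted - measured_doa_pps) / measured_doa_pps
print(f"predicted {predicted:.0f} pkt/s, measured {measured_doa_pps} pkt/s, "
      f"gap {relative_gap:.2%}")
```

Under this simple model the predicted rate is within about 0.1% of the measured one, which is consistent with attributing the slowdown to the larger packets.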


8.3 Processing Time 


For small packets, however, in addition to packet size 
overhead, CPU costs for per-packet DOA operations will 
limit the rate at which DOA hosts can process packets. 
We now characterize this potential bottleneck on small 
packets (64-byte UDP payloads). Using Click tools to 
read the processor’s cycle counter before and after our 
DOA modules, we estimated the number of cycles used 
by the modules; the reported numbers are averages over 
80000 processed packets. Note that these numbers are 
upper bounds on average processing time: the imple- 
mentation is untuned, and our measurements include cy- 
cles consumed by interrupt handling for other kernel pro- 
cesses. Table 1 summarizes our observations. 

We first measured the processing time needed by our 
receiver’s DOA software, which translates DOA packets 
to IP packets (labeled “DOA → IP” in Table 1). This
component takes nearly 1900 cycles (1.11 μs on the host
we used for testing, which has an Intel Celeron 1.7 GHz
CPU) to process each packet.
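
The conversion from cycles to time, and the per-packet rate ceiling it implies, can be sketched in Python as follows. The function names are ours, and the ceiling assumes that this one component is the only per-packet cost on a single CPU, which is a simplification.

```python
CPU_HZ = 1.7e9  # clock rate of the Intel Celeron test host


def cycles_to_us(cycles: float, cpu_hz: float = CPU_HZ) -> float:
    """Time in microseconds needed to execute the given number of cycles."""
    return cycles / cpu_hz * 1e6


def max_packets_per_second(us_per_packet: float) -> float:
    """Upper bound on packet rate if us_per_packet were the only cost."""
    return 1e6 / us_per_packet


# "Nearly 1900 cycles" corresponds to slightly under 1.12 microseconds.
print(f"{cycles_to_us(1900):.2f} us per packet")
# At 1.11 microseconds per packet, the ceiling is roughly 900,000 packets/s.
print(f"{max_packets_per_second(1.11):.0f} packets/s")
```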

We next considered the processing time for the
operations associated with an RPF (§6), namely HMAC [32]
computation and verification. In our experiments, the
RPF operates as described in §7.3; it uses a single default 
rule that passes all packets intended for the receiver and 
holds a mapping from the receiver’s EID to an IP address. 
The RPF computes the MAC for each packet, writes the 
MAC to the DOA header, and forwards the packet to 
the receiver. When the receiver gets the packet, it ver- 
ifies the MAC before passing the packet to its end-host 
DOA software. As shown in Table 1, the filter takes more 
than 9400 cycles (5.54 μs) to apply its rule and compute
a MAC, and the receiver takes nearly 8800 cycles (5.16
μs) to verify the MAC.
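
The compute-then-verify pattern measured here can be illustrated with Python's standard `hmac` module. This is a sketch under stated assumptions, not our Click implementation: the choice of SHA-256, the shared key, and the bytes covered by the MAC are placeholders, since this section does not specify which hash function or header fields are used.

```python
import hashlib
import hmac


def rpf_compute_mac(key: bytes, covered: bytes) -> bytes:
    """Filter side: compute a MAC over the covered bytes.
    SHA-256 is an assumption made for this example."""
    return hmac.new(key, covered, hashlib.sha256).digest()


def receiver_verify_mac(key: bytes, covered: bytes, mac: bytes) -> bool:
    """Receiver side: recompute the MAC and compare in constant time."""
    expected = hmac.new(key, covered, hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)


# Hypothetical key shared by the RPF and the receiver, and placeholder bytes.
shared_key = b"example-shared-key"
covered_bytes = b"placeholder for the fields the MAC protects"

mac = rpf_compute_mac(shared_key, covered_bytes)
print(receiver_verify_mac(shared_key, covered_bytes, mac))         # True
print(receiver_verify_mac(shared_key, covered_bytes + b"x", mac))  # False
```

Both sides perform one HMAC computation per packet, which is consistent with the filter and verify costs in Table 1 being of similar magnitude.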


8.4 Discussion 


Of the three types of costs imposed by DOA—lookup la- 
tency, packet size, CPU overhead—the latter two would 
only appear to users under the most stressful system con- 
ditions. The first cost, latency, is serious because, ab- 
sent optimizations, it is visible to end-users. However, 
as noted in §8.1, there are several caching strategies that 
substantially mitigate, if not eliminate, this latency. Ul- 
timately, we believe that the costs imposed by DOA are 
outweighed by the benefits of a coherent framework for 
reasoning about, and deploying, intermediaries. 


9 Related Work 


Besides the direct influence of i3 [57], HIP [39-41],
and UIP [17] on our mechanisms and insights, an older 
proposal for location-independent EIDs [34] grew out 
of Nimrod [9]. Shoch [52] and Saltzer [50] have been 
among many (see [10, 11, 13, 19, 33, 42, 43] and refer- 
ences therein) to distinguish between network elements’ 
identity and location. Indeed, much of what we mention 
below separates these two concepts, usually by creating a 
set of end-host identifiers distinct from network location. 
i3 canonizes this separation with an infrastructure 
that uses flat identifiers in packets to decouple sending 
(into the infrastructure) and receiving (from the infras- 
tructure). These identifiers name services whereas EIDs 
name hosts. Like DOA, i3 is specifically designed for 
senders and receivers to invoke intermediaries. i3 does
not hold the proliferation of private addressing realms
as a principal concern, but one can leverage i3 to reach
machines behind NATs without modifying or configuring
the NATs [28]. The main difference is that the DOA
architecture requires a resolution infrastructure while i3
depends on a forwarding infrastructure; under the pure
design, all i3 packets are sent into the infrastructure.
TRIAD [22] is an extension to the Internet that ad- 
dresses many architectural ills, including NAT. TRIAD 
hosts receive location-independent names. As in DOA, 
these names may resolve to a logical path, and IPv4 ad- 
dresses are routing tags without end-to-end significance. 
TRIAD does not focus on a framework for network-layer 
middleboxes, though its mechanisms can certainly
accommodate them, and the authors give a solution for
NAT traversal. The technical details of our approaches 
differ: TRIAD names are derived from domain names (in 
contrast to flat EIDs), and under TRIAD, resolution and 
routing are conflated, thereby improving latency. 

HIP also separates location and identity; its goal is ar- 
chitectural support for mobility and multi-homing. DOA 
borrows some of HIP’s mechanisms and applies them to 
middlebox issues, which is not HIP’s focus. 

In contrast, some work is expressly motivated by the 
proliferation of private addressing realms. UIP [17], from 
which we also borrow, creates an overlay among partici- 
pating hosts to interconnect heterogeneous or NATed net- 
works. Like DOA, UIP incorporates HIP-style flat host 
identifiers. UIP hosts form an ad-hoc overlay by using a 
DHT-inspired algorithm to route packets for each other 
based on the destination identifier. Our approach con- 
trasts with UIP’s in that, while both projects address mid- 
dleboxes’ proliferation, we focus on an architecture that 
explicitly welcomes middleboxes whereas UIP’s over- 
lay of peers makes them transparent. IPNL [20] is an 
extension to the Internet architecture that solves prob- 
lems resulting from private addressing realms. IPNL re- 
lies in part on bona fide host identifiers; these identifiers 
are domain names, though the authors acknowledge the 
security benefits of HIP-style flat identifiers. Like DOA, 
IPNL tries to coherently incorporate NATs into the In- 
ternet architecture, and both designs modify hosts and 
NATs but not IPv4 routers. 

Other projects have tried to obsolete middleboxes; 
these run the gamut from architectural enhancements 
to radical reorganizations. An example in the middle 
of this spectrum is IPv6 [15]. IPv6 addresses are glob- 
ally unique (thus addressing one motivation for NATs), 
but, as noted in §3.2, do not satisfy the requirement 
of topology-independence. Predicate Routing [47] and 
network capabilities [1] propose architectural enhance- 
ments for security and denial-of-service protection. Rad- 
ical network architectures and meta-architectures include 
Role-Based Network Architecture [7] and FARA [10]. 
Our approach contrasts with these because, first, our goal 
is explicit invitation of middleboxes and, second, these 
proposals, if fully realized, require at least some changes 
to all network elements, not just hosts and middleboxes. 

In contrast, other work has avoided creating identifiers 
for end-hosts but has nonetheless accepted middleboxes 
as an architectural problem to be worked around. MID- 
COM [56,58] is a protocol and framework intended to 
remove intelligence from NATs and firewalls by offload- 
ing application-specific behavior to designated agents, 
which insert dynamic state into intermediaries automat- 
ically. For example, in response to a globally reachable 
host initiating an Internet telephony session to a NATed 
host, the agent would ask the NAT to open the
appropriate destination UDP port and would close the port at
session’s end. Like DOA, MIDCOM aims to simplify 
management of NATs and firewalls by creating state
automatically. However, because MIDCOM focuses only
on application-level software, not on networking- and
transport-level software, persistent host identifiers are
unavailable to it, and thus its protocols devote
considerable energy (and complexity) to handling the
overloading of port fields. Also, MIDCOM’s techniques
work through only
one layer of NAT [56] in contrast to our supposition that 
hosts may be behind several layers. 

Twice NAT [55], Realm Specific IP [6,55], and 
STUN [48] all address specific problems posed by NATs. 
A recent Internet draft [18] summarizes various tech- 
niques by which P2P applications can handle middle- 
boxes. While useful for the current network architec- 
ture, these (largely manual) tactics for exposing NATed 
hosts would be unnecessary if all hosts had location- 
independent identifiers. Today, many home users attempt 
to create persistent identifiers for frequently renumbered 
hosts with Dynamic DNS, e.g., [16]. Since DNS names 
are resolved to IP addresses and are not carried in pack- 
ets, they are quite useful as naming handles for humans 
but not for network elements. 

DOA’s use of the delegation primitive to simplify fire- 
walls is preceded by a body of literature that addresses 
the error-prone and time-consuming nature of firewall 
configuration. The Firmato toolkit [4], for example, takes 
a language-based approach to simplifying firewall con- 
figuration by abstracting away low-level configuration 
details. Distributed firewalls [5,25,30] take simplification 
one step further: a centralized, managed entity down- 
loads firewall rules to end-hosts (which it identifies with 
IPSEC certificates in analogy with our use of EIDs to 
associate policies with hosts). In contrast to the approach 
described in §6, distributed firewalls do not off-load from 
clients the job of actually applying rules. 


10 Conclusion 


The Internet architecture was defined in a context where 
traffic was benign and addresses plentiful. There were 
no reasons to interpose functions other than forwarding 
between endpoints, which became the end-to-end rally- 
ing cry for the architecture. Today’s Internet is a very 
different place. There are many reasons why users inter- 
pose functions that, in the canonical architecture, either 
belonged on their host (such as firewalls) or didn’t be- 
long at all (such as NATs). The Internet architecture was 
not designed for—in fact, one might say it was designed 
against—such interposition of function. 

The current incarnations of interposition, middleboxes,
are widely derided for their violations of the
architecture and the resultant loss of flexibility in the
Internet. However, the complexity and risk associated with
being a network host, which used to be minimal, are now
daunting even to expert users. We therefore expect
outsourcing functionality to become increasingly common.

The architecture presented in this paper, DOA, has a 
simple goal: to allow the Internet to reap the benefits of 
network-level middleboxes without their harmful side- 
effects. It does so not by altering IP, or routers, but by 
making delegation a basic primitive and introducing a set 
of globally unique endpoint identifiers. 